CHAPTER 21 Summarizing and Graphing Survival Data 305

So, how do you analyze survival data containing censoring? The following sections

explain the correct ways to proceed as well as mistakes to avoid.

Analyzing censored data properly

Statisticians have developed techniques to utilize the partial information

contained in censored observations. Later in this chapter, we describe two of the

most popular techniques: the life-table method and the Kaplan-Meier (K-M)

method. To understand these methods, you first need to understand two

fundamental concepts: hazard and survival.

» The hazard rate is the probability of the participant dying in the next small

interval of time, assuming the participant is alive right now.

» The survival rate is the probability of the participant living for a certain

amount of time after some starting time point.
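As a rough illustration of how these two rates relate, here is a minimal Python sketch. The counts are made up, the function name hazard_and_survival is our own, and the sketch assumes that nobody is censored, so it is not the life-table or K-M method itself. It treats each year as the "small interval": the hazard is the fraction of participants alive at the start of the year who die during it, and the survival is the running product of the chances of getting through each year so far.

```python
# Hypothetical counts for one small cohort, followed in 1-year intervals.
# Assumption: no participant is censored, so everyone alive at the start
# of an interval is observed for the whole interval.
alive_at_start = [100, 90, 72, 54]
died_in_interval = [10, 18, 18, 27]


def hazard_and_survival(alive_at_start, died_in_interval):
    """Return a (hazard, survival) pair for each interval."""
    results = []
    surviving_fraction = 1.0  # everyone is alive at the starting time point
    for alive, died in zip(alive_at_start, died_in_interval):
        # Hazard: chance of dying in this interval, given alive at its start.
        hazard = died / alive
        # Survival: chance of getting through this interval and all earlier ones.
        surviving_fraction *= 1.0 - hazard
        results.append((hazard, surviving_fraction))
    return results


for year, (hazard, survival) in enumerate(
    hazard_and_survival(alive_at_start, died_in_interval), start=1
):
    print(f"Year {year}: hazard = {hazard:.2f}, survival = {survival:.2f}")
```

In these made-up numbers the hazard climbs from 0.10 in the first year to 0.50 in the fourth, while the survival falls from 0.90 to 0.27. The two rates describe the same deaths from different angles: hazard looks only at the next interval, and survival accumulates everything since the starting time point.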

The first task when analyzing survival data is usually to describe how the hazard

and survival rates vary with time. In this chapter, we show you how to estimate

the hazard and survival rates, summarize them as tables, and display them as

graphs. Most of the larger statistical packages (such as those described in

Chapter 4) allow you to do the calculations we describe automatically, so you may

never have to do them manually. But without first understanding how these

methods work, it’s almost impossible to understand any other aspect of survival

analysis, so we provide a demonstration for instructional purposes.

Making mistakes with censored data

Here are two mistakes you need to avoid when working with survival data:

» You shouldn’t exclude participants with a censored survival time from any

survival analysis!

» You shouldn’t replace the censored date with some other value, a practice

called imputing. When you impute numerical data to replace a missing value, it

is common to use the last observed value for that participant (called last

observation carried forward, or LOCF, imputation). However, you should not

impute dates in survival analysis.
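To make the two mistakes concrete, here is a small Python sketch on an invented data set of ten participants. Everything in it is an assumption for illustration: the data, the function names, the 5-year horizon, and the choice to model imputation as treating the last-seen date as the date of death. For comparison, it also includes a bare-bones version of the Kaplan-Meier product-limit calculation, which keeps each censored participant in the at-risk group until the time they were last seen.

```python
# Hypothetical follow-up data: (years after surgery, died?).
# died=False means the participant was censored (alive when last seen).
participants = [
    (1, True), (2, True), (3, False), (4, True), (5, False),
    (6, False), (7, True), (8, False), (9, False), (10, False),
]


def kaplan_meier(data, horizon):
    """Estimate the probability of surviving beyond `horizon` years."""
    survival = 1.0
    death_times = sorted({t for t, died in data if died and t <= horizon})
    for t in death_times:
        # Censored participants still count as at risk until last seen.
        at_risk = sum(1 for time, _ in data if time >= t)
        deaths = sum(1 for time, died in data if died and time == t)
        survival *= 1.0 - deaths / at_risk
    return survival


def exclude_censored(data, horizon):
    """Mistake 1: drop censored participants, then count survivors."""
    death_times = [t for t, died in data if died]
    return sum(1 for t in death_times if t > horizon) / len(death_times)


def impute_last_seen_as_death(data, horizon):
    """Mistake 2: pretend each censored participant died when last seen."""
    return sum(1 for t, _ in data if t > horizon) / len(data)


HORIZON = 5  # years after surgery
print(f"Kaplan-Meier estimate:   {kaplan_meier(participants, HORIZON):.3f}")
print(f"Censored excluded:       {exclude_censored(participants, HORIZON):.3f}")
print(f"Last seen taken as death: {impute_last_seen_as_death(participants, HORIZON):.3f}")
```

On this invented data, the Kaplan-Meier estimate of surviving beyond 5 years is about 0.686, whereas excluding the censored participants gives 0.250 and treating the last-seen date as the death date gives 0.500. Both shortcuts understate survival here because they discard or distort the time that censored participants were known to be alive.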

Exclusion and imputation don’t work to fix the missingness in censored data. You

can see why in Figure 21-2, where we’ve slid the timelines for all the participants

over to the left as if they all had their surgery on the same date. The time scale

shows survival time in years after surgery instead of chronological time.